A systematic review of non-coding RNA genes with differential expression profiles associated with autism spectrum disorders

Aims To identify differential expression of shorter non-coding RNA (ncRNA) genes associated with autism spectrum disorders (ASD). Background ncRNA are functional molecules that derive from non-translated DNA sequence. The HUGO Gene Nomenclature Committee (HGNC) have approved ncRNA gene classes with alignment to the reference human genome. One subset is microRNA (miRNA), which are highly conserved, short RNA molecules that regulate gene expression by direct post-transcriptional repression of messenger RNA. Several miRNA genes are implicated in the development and regulation of the nervous system. Expression of miRNA genes in ASD cohorts has been examined by multiple research groups. Other shorter classes of ncRNA have been examined less. A comprehensive systematic review examining expression of shorter ncRNA gene classes in ASD is timely to inform the direction of research. Methods We extracted data from studies examining ncRNA gene expression in ASD compared with non-ASD controls. We included studies on miRNA, piwi-interacting RNA (piRNA), small NF90 (ILF3) associated RNA (snaR), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), transfer RNA (tRNA), vault RNA (vtRNA) and Y RNA. The following electronic databases were searched: Cochrane Library, EMBASE, PubMed, Web of Science, PsycINFO, ERIC, AMED and CINAHL for papers published from January 2000 to May 2022. Studies were screened by two independent investigators with a third resolving discrepancies. Data were extracted from eligible papers. Results Forty-eight eligible studies were included in our systematic review, with the majority examining miRNA gene expression alone. Sixty-four miRNA genes had differential expression in ASD compared to controls as reported in two or more studies, but often in opposing directions. Four miRNA genes had differential expression in the same direction in the same tissue type in at least 3 separate studies. 
Increased expression was reported for miR-106b-5p, miR-155-5p and miR-146a-5p in blood, post-mortem brain, and across several tissue types, respectively. Decreased expression was reported for miR-328-3p in blood samples. Seven studies examined differential expression from other classes of ncRNA, including piRNA, snRNA, snoRNA and Y RNA. No individual ncRNA genes were reported in more than one study. Six studies reported differentially expressed snoRNA genes in ASD. A meta-analysis was not possible because of inconsistent methodologies, disparate tissue types examined, and varying forms of data presented. Conclusion There is limited but promising evidence associating the expression of certain miRNA genes and ASD, although the studies are of variable methodological quality and the results are largely inconsistent. There is emerging evidence associating differential expression of snoRNA genes with ASD. It is not currently possible to say whether the reports of differential expression in ncRNA relate to ASD aetiology, a response to shared environmental factors linked to ASD such as sleep and nutrition, other molecular functions, human diversity, or chance findings. To improve our understanding of any potential association, we recommend improved and standardised methodologies and reporting of raw data. Further high-quality research is required to clarify possible associations, which may yet yield important information.



Introduction
Autistic people are thought to account for at least 1% of the global population [1]. Individuals with a diagnosis of autism have differences in social communication and are more likely to have intense interests [2-4]. People with autism belong within a spectrum of neurodiversity that is important for society and evolution [5]. For the purpose of this systematic review we have followed the established international diagnostic criteria and the corresponding nomenclature [6]. Hereafter we use the associated terminology, autism spectrum disorder (ASD), although we acknowledge that different perspectives exist regarding language and terminology preferences [7-9]. The genomic landscape of ASD is complex [10]; however, a strong genetic aetiology is recognised [11,12], with twin studies estimating heritability at 70-90% [13,14]. Access to broad genomic testing is reshaping our understanding of ASD, which appears to encompass a collection of broad, heterogeneous [15] and variable conditions with overlapping neurobehavioral phenotypes [16]. These may be considered on one hand as complex or syndromic when ASD symptomatology features alongside intellectual disability, facial systematic review exploring gene expression of other ncRNA. Secondly, a large proportion of early research in this field is from post-mortem samples of brain tissue [42,43]. These are important for discovery but may lack clinical translatability. To realise the potential of ASD ncRNA gene expression assays for biomarker use, we require an appreciation of the combined expression data from living patients with ASD from clinically available samples. To date there have been some narrative, discursive, selective or scoping reviews [25,42,44-48] and just one recent systematic review, which examines only miRNA expression associated with ASD and omits some studies [41].
Finally, in view of the recent international nomenclature describing ncRNA with HGNC-approved human genome annotation [26], we are keen to collate and present up-to-date and standardised ncRNA gene expression data associated with ASD. We acknowledge that there may be a paucity of evidence for classes of ncRNA other than miRNA, but demonstrating and delineating this clearly by systematic review is important to help shape future research directions.

Study eligibility criteria
The inclusion criteria were as follows:
Population: Human subjects with a diagnosis of ASD compared with controls without ASD.
Exposure: ncRNA gene expression profiles from biosamples measuring HGNC-approved ncRNA genes or piRNA genes listed in piRBase v3.0.
Studies: Peer-reviewed publications, conference abstracts or dissertations.
The exclusion criteria were as follows: studies not published in English, duplicated data, non-human studies, review articles, hypothesis papers, narrative reviews, fact sheets and letters to the editor that did not present unique or new data, unpublished materials and studies published before 2000.

Procedure
Two reviewers independently screened the titles and abstracts to identify all eligible studies returned by the searches. Any discrepancies were adjudicated by a third reviewer. The reference lists of selected articles were used to identify additional papers for screening. The included studies were assessed using the Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) guidelines [62]. Extracted data were recorded in a dedicated data extraction form generated using Microsoft Excel for further evaluation of study quality and data synthesis, including functional enrichment analysis of the significantly differentially expressed miRNA genes. Raw data were retrieved from published papers, supplementary materials or by contacting the corresponding authors.

Data synthesis and quality assessment
We planned to perform a meta-analysis of ncRNA gene expression using the statistical techniques employed by Zhu and Leung [63], including a random effects model [64], to examine differentially expressed ncRNA genes in ASD compared with controls. We expected between-study heterogeneity and planned subgroup analyses to explore possible sources, including source of patients, source of controls (such as healthy controls or disease controls), participant ethnicity, ncRNA profile (single ncRNA or multiple ncRNA), sample specimen (blood, saliva, urine, cultured lymphoblastoid cells, fibroblast cells, neural tissues, and others) and whether samples came from living or post-mortem donors. We planned to assess the statistical heterogeneity of the meta-analysis using the chi-squared (χ²)-based Q statistic test and the I² statistic, with heterogeneity considered significant when I² exceeded 50% or P < 0.1.
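As an illustration of the kind of analysis that was planned, the following minimal sketch pools per-study effect sizes with a DerSimonian-Laird random effects model and reports Cochran's Q and I². The DerSimonian-Laird estimator is an assumption made for this example (the text specifies only "a random effects model"), and the function name and input values are hypothetical; no data from the included studies are used.

```python
import math


def random_effects_meta(effects, variances):
    """Pool study effects (e.g. log2 fold changes) with DerSimonian-Laird.

    effects: per-study effect estimates (hypothetical inputs).
    variances: per-study sampling variances of those estimates.
    Returns the pooled estimate, its standard error, Q, tau^2 and I^2 (%).
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]           # inverse-variance weights
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = k - 1

    # Between-study variance (truncated at zero)
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study variance
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))

    # I^2: share of total variation attributable to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "se": se, "Q": q, "df": df, "tau2": tau2, "I2": i2}


if __name__ == "__main__":
    # Two hypothetical studies reporting discordant effects
    print(random_effects_meta([0.0, 1.0], [0.1, 0.1]))
```

Under the threshold described above, an I² above 50% in such output would have prompted the planned subgroup analyses.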
We planned to generate receiver operating characteristic (ROC) curves, with sensitivity, specificity and positive predictive values based on known assessments of participants with or without ASD. The area under the curve (AUC) was to be calculated both overall and for any subgroup analysis. Statistical tests were intended to be two-sided, with P < 0.05 considered statistically significant. Functional enrichment analysis of statistically significant differentially expressed miRNA genes, as determined by meta-analysis, would be performed using DIANA-miRPath v3.0 [65] and executed using the online DIANA-microT-CDS webserver algorithm to examine Gene Ontology (GO) with 'categories union'. P-value and microT thresholds would be set at < 0.05 and 0.8, respectively, with False Discovery Rate (FDR) correction applied. Targeted pathways and significance clusters would be generated and a related heatmap constructed.
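The planned diagnostic accuracy measures can be sketched as follows. This is a minimal illustration that assumes each participant has a single expression-derived score and a known ASD status (1 for ASD, 0 for control); the function names, the scoring convention (higher score predicts ASD) and the example values are hypothetical. The AUC is computed here through its rank interpretation: the proportion of ASD-control pairs in which the ASD participant has the higher score, with ties counted as one half.

```python
def diagnostic_metrics(scores, labels, threshold):
    """Sensitivity, specificity and PPV when scores >= threshold are called ASD."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_asd = score >= threshold
        if predicted_asd and label == 1:
            tp += 1
        elif predicted_asd:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominators) are reported as NaN
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of ASD and control scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one ASD and one control participant")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.4, 0.3, 0.2]   # hypothetical expression-derived scores
    labels = [1, 1, 0, 1, 0]             # 1 = ASD, 0 = control
    print(diagnostic_metrics(scores, labels, threshold=0.5))
    print(auc(scores, labels))
```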
We planned an assessment of publication bias

Studies identified for selection
The systematic review search strategy yielded 5250 publications, of which 1221 were duplicates. The titles and abstracts of 4029 papers were screened and 168 papers were assessed in full for eligibility. Forty-eight studies were identified for inclusion in the systematic review for data extraction. This process is outlined, along with reasons for exclusion, in the PRISMA flow chart (Fig 1).
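The counts in this selection process can be checked with simple arithmetic. The sketch below uses a hypothetical helper function; the four input numbers are those stated in the text, while the two exclusion counts are derived here by subtraction and are not figures reported by the authors.

```python
def prisma_flow(identified, duplicates, full_text_assessed, included):
    """Derive the intermediate counts of a PRISMA-style selection flow."""
    screened = identified - duplicates
    return {
        "screened": screened,
        "excluded_at_screening": screened - full_text_assessed,
        "excluded_at_full_text": full_text_assessed - included,
        "included": included,
    }


if __name__ == "__main__":
    flow = prisma_flow(identified=5250, duplicates=1221,
                       full_text_assessed=168, included=48)
    print(flow)
```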

Summary of eligible studies
This systematic review has brought together the findings of 48 studies involving over 1800 individuals with ASD compared with over 1400 controls. The year of publication ranged from 2008 to 2021. ASD ncRNA gene expression studies have been conducted in numerous countries, including Brazil, Bulgaria, China, Egypt, Iran, Italy, Japan, the United Kingdom, and the United States of America (USA). The most prolific country for publication was the USA, with 12 studies. Across all included studies, 16 reported that ASD was diagnosed using both a validated assessment tool and Diagnostic and Statistical Manual of Mental Disorders (DSM-5) criteria. Fourteen studies reported only the use of a validated assessment tool, the most common being the Autism Diagnostic Interview-Revised (ADI-R) and the Autism Diagnostic Observation Schedule (ADOS). Eight studies used only World Health Organisation (WHO) or DSM diagnostic criteria without a validated assessment tool, and 10 studies did not state the method of ASD diagnosis. The vast majority of studies (N = 46) examined miRNA gene expression; 41 studies did so exclusively and 7 studies examined other classes of ncRNA, of which 5 also measured miRNA gene expression (including a genome-wide ncRNA expression study encompassing miRNA genomic loci). Fourteen studies used pre-selected, candidate-driven ncRNA expression approaches, for example where specific miRNAs had been investigated, in contrast to 34 studies that investigated unselected or larger populations of ncRNA genes, including those examined using genome-wide approaches. Many of these studies went on to examine ('validate') a selected population of miRNA genes identified by an initial unselected approach such as microarray or RNA-seq.
Thirty-three studies reported ncRNA expression findings using tissue samples and laboratory methodologies that could feasibly be implemented into clinical practice (i.e., those from living individuals, with routine sampling methodology of easily obtainable tissue such as blood or saliva and routine laboratory work). These studies had a male to female ratio of participants of 3.5 to 1. There were 15 studies that exclusively reported findings from samples with less feasible or unfeasible clinical implementation possibilities (i.e., samples derived from post-mortem brain tissue, samples from living individuals requiring specialist sampling procedures such as biopsies, or samples requiring complex or time-consuming laboratory work such as cell culturing). These studies had a male to female ratio of participants of 4.8 to 1. There were two studies that examined ncRNA expression from both clinically feasible and unfeasible samples.

PLOS ONE
Systematic review of non-coding RNA differential expression profiles associated with autism spectrum disorders

Table 2 provides a summary of 33 studies describing methods feasible for clinical implementation. Of these, 29 reported ncRNA gene expression from peripheral blood and 4 reported from saliva samples. We found no studies exploring ncRNA gene expression from other bodily fluids such as urine or sweat. Table 3 summarises the studies with less feasible or unfeasible clinical implementation. Of these 17 studies, 10 used post-mortem brain tissue samples, five used cultured lymphoblastoid cell lines, one used reprogrammed induced pluripotent stem cell-derived neurons, and one used both olfactory mucosal stem cells and primary skin fibroblasts [71]. Two of these studies examined ncRNA gene expression from both clinically feasible and unfeasible samples [71,72], and therefore feature in both Tables 2 and 3. Table 4 provides an overview of the individual ncRNA genes (all of which are miRNA genes) that have been reported to have increased or decreased expression in ASD cohorts in more than one study. The individual miRNA genes are listed with the direction of expression change presented by broad tissue sample types: blood, saliva, cultured lymphoblastoid cells (unless otherwise specified) and post-mortem brain samples. The seven studies examining differential expression of ncRNA genes other than miRNA are presented in a separate table (Table 5).

Non-coding RNA with differential expression in ASD
The systematic review revealed 64 miRNA genes with differential expression in more than one study (Table 4). Twenty-nine of these miRNA genes had differential expression in opposing directions. Four miRNA genes had differential expression in the same direction in the same tissue type in at least 3 separate studies. Two of these were reported in blood samples: miR-106b-5p [73-75] and miR-328-3p [73, 76, 77], which had increased and decreased expression, respectively. A third, miR-155-5p, had increased expression in post-mortem brain samples [78-80]. Finally, miR-146a-5p had consistent, increased differential expression across several different tissue types as reported in four studies [71, 81-83]. These were from saliva, primary skin fibroblasts, lymphoblastoid cell lines, olfactory mucosal cells and post-mortem brain samples from the prefrontal cortex and temporal lobe. Seven research studies examined ncRNA gene expression, other than miRNA, in association with ASD (Table 5). From these studies, differential expression was reported in individual genes from ncRNA classes including piRNA, snRNA, snoRNA and Y RNA. Differential expression of one or more snoRNA genes was reported by six studies, but no individual snoRNA gene or other individual ncRNA gene (excluding miRNA genes) had differential expression reported in more than one study.
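The consistency rule applied above (same direction, same tissue type, at least three separate studies) amounts to a simple tally. The sketch below is illustrative only: the function names and record layout are hypothetical, and the example records encode just the blood findings for miR-106b-5p and miR-328-3p, keyed by the reference numbers cited in this review (reference [102] being the report of increased miR-328-3p expression in peripheral blood).

```python
from collections import defaultdict


def consistent_genes(reports, min_studies=3):
    """Return (gene, tissue, direction) keys supported by >= min_studies studies.

    reports: iterable of (study_id, gene, tissue, direction) tuples.
    """
    supporting = defaultdict(set)
    for study, gene, tissue, direction in reports:
        supporting[(gene, tissue, direction)].add(study)  # count each study once
    return sorted(key for key, ids in supporting.items() if len(ids) >= min_studies)


def conflicting_genes(reports):
    """Return genes reported with both increased and decreased expression."""
    directions = defaultdict(set)
    for _study, gene, _tissue, direction in reports:
        directions[gene].add(direction)
    return sorted(gene for gene, seen in directions.items() if len(seen) > 1)


if __name__ == "__main__":
    reports = [
        ("ref73", "miR-106b-5p", "blood", "up"),
        ("ref74", "miR-106b-5p", "blood", "up"),
        ("ref75", "miR-106b-5p", "blood", "up"),
        ("ref73", "miR-328-3p", "blood", "down"),
        ("ref76", "miR-328-3p", "blood", "down"),
        ("ref77", "miR-328-3p", "blood", "down"),
        ("ref102", "miR-328-3p", "blood", "up"),
    ]
    print(consistent_genes(reports))
    print(conflicting_genes(reports))
```

A tally of this kind is a vote count rather than a meta-analysis: it ignores effect sizes, sample sizes and study quality, which is why the findings are reported descriptively.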

Data synthesis and meta-analysis
Functional enrichment analysis using the DIANA-miRPath v3.0 online interface [65] was performed, interrogating the four key miRNA genes identified in this review (miR-106b-5p, miR-328-3p, miR-146a-5p and miR-155-5p) against Gene Ontology (GO) categories. Clusters with the highest enrichment significance levels were seen in the 'ion binding' and 'organelle function' GO categories (S1 Fig). We originally planned a series of statistical and publication bias assessments as part of a meta-analysis, but these were not possible [64, 68, 69]. We added a field on statistical analysis to the quality assessment, given the difficulty of analysing complex data sets and the multiple testing in the included studies [120]. The quality assessment using adapted QUADAS-2 [70] is shown in S1 Table.

Discussion
We consider here our findings from 46 studies that examined miRNA gene expression and 7 studies examining other classes of ncRNA.

Differential expression of miRNA genes in ASD
Several miRNA genes have been reported to have differential expression in two or more studies (Table 4). Whilst this initially appears promising, many of these are in opposing directions. This may relate to tissue specificity of miRNA gene expression [71], type I errors related to the high numbers of miRNA genes tested and/or statistical tests performed [121], or the heterogeneous nature of ASD aetiology or of the study populations examined [15,122]. Further issues around study quality and bias are considered later in the discussion. Only four miRNAs had differential expression in the same direction and tissue type in at least three studies: miR-106b-5p, miR-146a-5p, miR-155-5p and miR-328-3p. Intriguingly, in addition to the studies in our systematic review, a single case study examining genome-wide differential miRNA gene expression in the post-mortem prefrontal cortex of one deceased individual with ASD compared with a non-ASD sibling control (i.e. ASD of N = 1) also found that miR-106b-5p and miR-146a-5p were among its top six differentially expressed miRNA genes [123]. It is instructive to consider the four notable miRNA genes identified in our systematic review in more detail, although caution should be exercised given the possibility of selective research and/or reporting and the high levels of potential bias found in our quality assessments.

Four notable miRNA genes with differential expression in ASD
miR-106b-5p. It has previously been reported that miR-106b-5p has altered expression in schizophrenia [124]. The finding that ASD and childhood onset schizophrenia both share altered expression is under research scrutiny [125], although both these groups also have high associated rates of pathogenic copy number variants and brain trauma [126] and there is a long history of some diagnostic overlap [127]. miR-106b-5p has a wide influence on various biological processes including cancer [128,129] and in isolation is unlikely to demonstrate disease specificity.
miR-146a-5p. miR-146a-5p was found to have uniformly increased expression in our systematic review across a wide range of tissue types including saliva, primary skin fibroblasts, lymphoblastoid cell lines, olfactory mucosal cells and post-mortem brain samples from the prefrontal cortex and temporal lobe [71, 78, 81-83]. One of these studies examined tissue and disease specificity of miR-146a-5p (with three other miRNA genes) and found no differential expression in peripheral blood mononuclear cells (PBMC) from a group of ASD patients compared to controls [71]. miR-146a-5p has also been implicated in a number of biological processes including regulation of the development of viral infections [130] and cancer tumour suppression [131], for example in the inhibition of both EGFR and NF-kB signalling and reduction of the metastatic potential of cancers [132].
miR-155-5p. miR-155-5p showed a degree of uniformity in our systematic review, with increased expression in the amygdala, prefrontal cortex and temporal cortex regions in three post-mortem studies [78-80] but with no significant differential expression found in dorsolateral prefrontal cortex [79]. miR-155-5p has been implicated in inflammatory processes [133,134] and the modulation of cancer [135]. miR-155-5p expression appears to be involved in impaired development of dendritic cells, B cells and T cells and is important for immune response [136,137]. Moreover, it was one of several differentially expressed miRNA genes associated with a basket of neurodegenerative diseases, including idiopathic Parkinson's disease, where miR-155-5p has been reported to have increased expression [138].
miR-328-3p. In our systematic review, miR-328-3p was found to have decreased expression in peripheral blood samples in three studies, which examined serum [76, 77] and plasma [73], but a further study reported increased expression in peripheral blood [102]. miR-328-3p has been thought to have a role in cancer, whereby suppression is believed to impair stem cell function, a mechanism hypothesised to prevent ovarian cancer metastasis [139].
Functional enrichment analysis. Functional enrichment analysis by DIANA-miRPath v3.0 [65] of the four key miRNA genes identified in this systematic review (miR-106b-5p, miR-146a-5p, miR-155-5p and miR-328-3p) against gene ontology categories identified the most significant levels of enrichment in the 'ion binding' and 'organelle function' GO categories (S1 Fig). Ion binding is an interesting finding, given the theories of channelopathy dysregulation in the pathogenesis of ASD [140-143]. However, the miRNA research community has raised well-articulated concerns that functional enrichment and pathway analysis of miRNA should be interpreted cautiously [144-147]. For example, there have been suggestions that the results from standard analyses are biased by over-represented terms, may suffer from ascertainment bias towards the most studied molecular pathways, and may be limited by selective coverage of annotated genes within a gene set [144]. Some solutions to these challenges have been proposed [144,145,147] but are beyond the scope of this review.

Other ncRNA with differential expression in ASD
Whilst the majority of papers identified in this systematic review examined miRNA gene expression, other ncRNA genes with differential expression were reported in seven papers, including differential expression of snoRNA genes (Table 5). One of these studies, by Salloum-Asfar and colleagues (2021) [73], was the first to report stable expression of piRNA, snoRNA, Y RNA and tRNA genes in plasma, a helpful attribute for further research. Two of the seven papers were published by the same research group [84, 88], and described overlapping ASD ncRNA 'diagnostic signatures' derived from re-annotation and analysis of expression data from an external dataset, with validation using recruited participants. Together these two studies described nine snRNA genes [84], one snoRNA gene [88] and one Y RNA pseudogene in overlapping ncRNA expression diagnostic models measured in blood (Table 5). Unfortunately, the corresponding raw data, the strength and direction of expression change, and how each ncRNA gene contributed to their 'signature formula' models were not clearly reported [84,88]. The small number of ncRNA gene expression studies in cohorts of individuals with a diagnosis of ASD is in itself an important finding to report, to help shape future research directions, given their cellular mechanisms and theoretical links with ASD. Each ncRNA class with reports of differential expression in ASD found in our review is discussed in turn below.
Small nucleolar RNA. Six studies examined differential expression in snoRNA genes in ASD. snoRNA can be divided into three major classes: C/D box snoRNAs (SNORDs), H/ACA box snoRNAs (SNORAs) and small Cajal body-specific RNAs (scaRNAs) (Table 1). snoRNAs accumulate in the nucleoli of the cell and have roles in post-transcriptional modification and maturation of ribosomal RNA and snRNA [56,85,148,149] and in mRNA processing and splicing [150]. There is interest in snoRNA splicing disruption affecting neuronal development and function [151-153]. snoRNA have been associated with a range of human diseases [154] including ASD, and are gathering interest [43,73,84-86]. Differentially methylated genomic regions of paternal sperm samples have been associated with ASD-related phenotype at 12 months of age [22]. The genomic region exhibiting differential methylation in paternal sperm in this study contains fifteen snoRNA genes within the SNORD-115 cluster, which lies within the Prader-Willi syndrome critical region on chromosome 15. Prader-Willi syndrome is an imprinting condition that can manifest with a neurobehavioral phenotype with aspects of ASD symptomatology [155].
Small nuclear RNA. Most snRNA are involved in the major and minor spliceosome complexes, which splice the introns from pre-messenger RNA [53]. snRNA and the related core spliceosomal U-snRNP complexes are associated with numerous diseases including those with neurological manifestations such as spinal muscular atrophy (SMA), amyotrophic lateral sclerosis and Burn-McKeown syndrome [156-159]. Some authors have proposed an association of snRNA with ASD [84, 88, 160], including Zhou and colleagues (2019), identified in this review, who report an ASD-ncRNA 'diagnostic signature' in blood composed entirely of snRNA genes [88].
Piwi-interacting RNA. piRNA are frequently considered with miRNA, given their comparable size, and overlapping molecular functions [51]. In contrast to miRNA, piRNA are predominantly expressed in germline cells and function to silence transposable elements and regulate gene expression through RNA cleavage and methylation mechanisms. The role of piRNA is increasingly being described in somatic cells, such as in the nervous system and they have been implicated in neurodevelopmental and neurodegenerative disorders [161]. Rett syndrome is an X-linked dominant neurodevelopmental condition affecting females caused by pathogenic variants in the MECP2 gene [162]. Rett Syndrome is characterised by developmental regression following a period of apparently normal development, an ASD neurobehavioural phenotype and repetitive hand movements. The MECP2 gene is responsible for binding to methylated genomic DNA and has epigenetic functions required for neuronal development [163]. Interestingly, MECP2 knockout mice have increased piRNA expression profiles in the cerebellum [164]. MECP2 also has roles related to miRNA biogenesis, miRNA binding and lncRNA interactions [163].
Y RNA. One study identified in this systematic review reported five Y RNA genes (RNY4P36, RNY4P6, RNY4, RNY4P25 and RNY4P18) with decreased plasma expression in ASD compared with controls [73]. The same study reported four other Y RNA genes with differential expression associated with 'more symptoms' of ASD, with increased expression of RNY4P29 and decreased expression of RNY3P1, RNY3 and RNY4P28, respectively. Whilst there were no other studies reporting Y RNAs, Cheng and colleagues (2020) included a single Y RNA pseudogene known as RNY1P11 within their ASD ncRNA diagnostic signature in blood [88], but had no HGNC approved Y RNA genes within their model. Y RNAs were first discovered in the serum of people with systemic lupus erythematosus (SLE), a multisystemic autoimmune condition that can involve the brain [165]. Y RNA have cellular roles related to DNA replication, RNA stability and cellular stress responses [59,60].
Other classes of ncRNA lacking ASD differential expression evidence. Whilst no differential expression findings were forthcoming from this systematic review in relation to vtRNA, tRNA and snaR, we have highlighted some interesting literature relevant to ASD, worthy of further discussion.
Vault RNA. vtRNA play a role in neuronal synapse formation and so are of interest in ASD given postulated aetiologies such as altered neurone development including synapse formation [166]. vtRNA bind to and activate a mitogen-activated protein kinase (MEK) to amplify the RAS-MAPK signalling pathway [167]. There is emerging evidence associating RASopathies (a group of inherited disorders caused by pathogenic variants of genes encoding regulatory proteins within the RAS-MAPK signalling pathway) with an increased prevalence of ASD [168]. One such RASopathy is Legius syndrome, which interestingly has a murine model where the ASD-like neurobehavioral phenotype is ameliorated by MEK inhibitors [169]. Further work related to vtRNA expression in ASD could complement this research to support the possible clinical translation of ASD-related MEK inhibitor drug therapy [169].
Transfer RNA. tRNA genes are encoded by both nuclear and mitochondrial genomes. The mitochondrial genome has been proposed as a genetic modifier for ASD [170] and theories related to mitochondrial dysfunction in ASD have been proposed [171]. The mitochondrial genome encodes 22 transfer RNA genes and harbours the majority of pathogenic variants that result in broad and disparate disorders [172]. One report demonstrated a mitochondrial tRNA variant within a single family that was considered causative of a heterogeneous group of neurological disorders where ASD was a feature [173].
Small NF90 (ILF3) associated RNA. snaR gene expression may also be worthy of further examination in ASD, given their abundant expression within the testis and discrete regions of the brain [52]. Evidence from meta-analysis identifies advanced paternal age as a risk factor for ASD [174], which may be related to increased rates of genomic and epigenomic abnormalities within the germline cells [175]. It is also interesting that polymorphisms of SNAR-I (one of twenty snaR genes) are associated with increased lateral ventricle volume [176], which is one of two distinguishing neuroimaging features (alongside increased pallidum volume) found in a large ASD cohort that underwent high-resolution structural brain scans [177].

Limitations and quality assessments of studies
Quality of data and reporting. There are several limitations that need to be taken seriously both in interpreting the results from this systematic review and in planning for future research. The exact number of ASD participants from all included studies was difficult to ascertain as certain studies were not explicit in descriptions of study populations, and there were occasions where it was difficult to exclude some study population overlap [77,104,178]. The use of external datasets and biobank sample resources also made this challenging, with some instances where the same Gene Expression Omnibus (GEO) dataset was used (for example GSE18123 in three studies) [84,88,94]. Two of these studies were from one research group that also appeared to use the same internal datasets in both of their studies, but this was not readily apparent in their described methodologies [84,88]. Most studies use small sample sizes and several studies do not report how the diagnosis of ASD was established (e.g., whether they used validated measures). We identified studies that included participants with ASD present alongside confounding phenotypes for example, individuals with 'high-functioning' ASD [102], those who recruited both ASD and control participants from an allergy/immunology clinic [99], and individuals with high levels of consanguinity, epilepsy and dysmorphism [112], that may influence miRNA expression [179][180][181]. Participants were frequently recruited from convenience samples or clinic populations and many studies had a limited description of control groups with few or no assessments to characterise phenotype variations. These factors are further challenged by the heterogeneity of ASD and the use of small sample sizes [15,182]. We also recognised a large variation in the methods used to determine ncRNA gene expression and many studies omit important methodological details related to these.
Meta-analysis and data synthesis. Statistical methodological quality in the studies is highly variable, with many instances of small sample sizes and of studies using inappropriate statistical tests. It is unclear in some studies whether correction for multiple testing has been applied and, where stated, different methods have been used, such as Bonferroni or Benjamini-Hochberg correction. For meta-analysis, we considered methods to combine p-values [119], such as Stouffer's method [183], which is generally preferred when different weights are attributed to the p-values being combined. However, it is not clear how the direction of differential expression (often presented as fold change) should be incorporated. Some authors recommend the removal of genes with conflicting differential expression, so that only the genes with the same direction of fold change are combined [184], and others suggest that one-sided p-values can be used to take the direction of fold change into account. When not specified, the p-values given are presumably two-sided, but one-sided p-values are sometimes reported. We also observed that different statistical tests, including t-tests, Mann-Whitney U-tests and Tukey's multiple comparison tests, were used to generate the p-values. These were often reported as simple inequalities rather than precise values, making it unlikely that useful information could be extracted from their combination. High degrees of heterogeneity were apparent across studies with respect to participants, sample types and expression assays. It is well recognised that different cell types have tissue-specific 'miRNomes', and comparing ncRNA expression data across tissues therefore might not be appropriate [185]. Despite contacting several authors, we were not able to obtain full data sets in several cases.
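To illustrate why the choice of multiple-testing correction matters when results from different studies are compared, the following minimal Python sketch applies Bonferroni and Benjamini-Hochberg adjustment to the same set of p-values. The p-values are invented purely for illustration and are not taken from any included study, and the function names are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: each p is multiplied by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate).

    The i-th smallest p-value is scaled by m / i, and monotonicity is
    enforced by taking a running minimum from the largest rank downwards.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four miRNA genes tested in one study (illustration only).
EXAMPLE_P = [0.01, 0.04, 0.03, 0.005]

if __name__ == "__main__":
    alpha = 0.05
    for name, adjust in (("Bonferroni", bonferroni),
                         ("Benjamini-Hochberg", benjamini_hochberg)):
        adj = adjust(EXAMPLE_P)
        n_sig = sum(p < alpha for p in adj)
        print(f"{name}: adjusted={adj}, significant at {alpha}: {n_sig}")
```

With these illustrative values, two of the four tests remain below 0.05 after Bonferroni adjustment, whereas all four do after Benjamini-Hochberg adjustment, so lists of 'significant' genes from studies using different corrections are not directly comparable.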
In summary, our planned strategy for meta-analysis and integration of the findings from different studies was not possible [118] because of the large variation in data presentation, availability, statistical analysis used and many instances of poor reporting.
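One of the options discussed above, using one-sided p-values so that the direction of fold change is carried into Stouffer's weighted method, can be sketched as follows. This is an illustration under stated assumptions, not the analysis performed in this review: it assumes each study reports an exact two-sided p-value from a symmetric test statistic together with the direction of change, and it weights studies by the square root of sample size, which is a common but not universal choice. All function names and example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def one_sided_p(two_sided_p, direction):
    """One-sided p-value for the hypothesis 'expression is increased in ASD'.

    direction is +1 if the study reported increased expression and -1 if it
    reported decreased expression. Assumes a symmetric test statistic.
    """
    if not 0.0 < two_sided_p <= 1.0:
        raise ValueError("two_sided_p must be in (0, 1]")
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    half = two_sided_p / 2.0
    return half if direction == 1 else 1.0 - half


def stouffer_weighted(one_sided_ps, weights):
    """Weighted Stouffer combination: Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2)).

    Returns (Z, combined one-sided p-value for increased expression).
    A strongly negative Z indicates consistent decreased expression.
    """
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in one_sided_ps]
    z = sum(w * zi for w, zi in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    return z, 1.0 - _STD_NORMAL.cdf(z)


if __name__ == "__main__":
    # Hypothetical studies of one miRNA gene: (two-sided p, direction, sample size).
    studies = [(0.01, +1, 20), (0.04, +1, 30), (0.03, -1, 25)]
    ps = [one_sided_p(p, d) for p, d, _ in studies]
    ws = [sqrt(n) for _, _, n in studies]
    z, p_up = stouffer_weighted(ps, ws)
    print(f"Z = {z:.3f}, one-sided p (increased) = {p_up:.4f}")
```

The sketch makes the practical obstacle explicit: it needs exact p-values and directions from every study, so results reported only as inequalities (for example p < 0.05) cannot be combined in this way, and two equally weighted studies with the same p-value in opposite directions cancel to Z = 0.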
Factors affecting ncRNA gene expression. The field of ncRNA gene expression studies is littered with challenges in the interpretation of findings. Disease or developmental states may not be the only factors altering ncRNA expression. Exercise [186], sleep [187], nutritional intake [188,189] and infection [190] are just some factors that may impact ncRNA expression. Interestingly, sleep [191], nutrition [192], bowel habit [193] and exercise [194] may be markedly different in people with ASD compared to neurotypical people, raising the prospect that ncRNA differential expression findings may be a consequence of ASD and its associated patterns and lifestyles rather than (or as well as) a contributor to its aetiology. This is currently unclear and so research methodologies should attempt to examine and control for these factors where possible. There are numerous ways that ncRNAs are deployed in biological processes. As in multifactorial models of ASD aetiology [195], the role of ncRNAs may also be multifaceted and interactive.
We also know that sample collection, RNA extraction, purification, storage, handling, and testing conditions can greatly impact ncRNA expression [196,197,198]. For example, the use of an EDTA anticoagulant appears to influence specific miRNA expression, particularly after longer EDTA exposure times [196]. In our systematic review, EDTA blood tubes were used in several studies [73,75,76,89,90,93,101,105], with only a few studies using PAXgene blood RNA tubes [94,102,104,110]. Many studies omitted details about blood sample collection, including anticoagulant exposure timings. The quantity and quality of blood centrifugation have also been shown to alter the proportion of intra- and extracellular components, which may demonstrate different miRNA expression properties [197]. Challenges related to ncRNA data normalisation approaches also support the need for standardisation [199]. Caution is also required for the interpretation of post-mortem samples. In life, hypoxia is known to change miRNA function and expression [200] and so it is not surprising that post-mortem miRNAs are altered through the process of death, with degradation happening in different ways at different rates [201,202]. Post-mortem ncRNA gene expression studies therefore need to include supplementary tests to explore degradation to aid interpretation. In summary, the process of measuring ncRNA gene expression requires quality control and clear detailed reporting to allow comparison between studies for meaningful interpretation.
Differential gene expression in opposing directions. Our finding that studies report miRNA genes with differential expression in opposite directions needs further consideration. Another systematic review, in type 2 diabetes mellitus, reported that two thirds of differentially expressed miRNA genes were found in opposite directions [63]. Whilst this may suggest poor methodologies or reporting bias, we should be cautious about how we interpret this. Some miRNA genes appear to have greater tissue specificity than others [203]. In the context of cancer, differential expression of miR-125b in opposing directions is thought to represent oncogenic characteristics when expression is increased and loss of tumour suppressive functions when expression is decreased [204]. Differential expression in opposing directions of individual miRNA genes was observed in this systematic review on a population level, but also on an individual level [112]. There is evidence that the direction of miRNA (and other ncRNA) differential expression may change with age [86] or over time and may respond to environmental exposures such as smoking [205] and alcohol [206]. Whilst numerous miRNA genes have been associated with neurodevelopmental or neurodegenerative diseases [138], there is still much work to be done to understand whether miRNA differential expression plays a role in aetiology or reflects the numerous other factors described above, including a response to the condition itself.
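The kind of bookkeeping needed to separate consistent from conflicting findings, counting separate studies by gene, tissue and direction, can be sketched in Python as follows. The records use placeholder gene names (miR-A, miR-B) and invented study identifiers; they are not data from this review, and the function names and the record layout are assumptions made for illustration.

```python
from collections import defaultdict


def tally_directions(records):
    """Group study identifiers by (gene, tissue) and reported direction.

    records: iterable of (study_id, gene, tissue, direction) tuples, where
    direction is 'up' or 'down' in ASD relative to controls.
    """
    tally = defaultdict(lambda: {"up": set(), "down": set()})
    for study_id, gene, tissue, direction in records:
        if direction not in ("up", "down"):
            raise ValueError(f"unknown direction: {direction!r}")
        tally[(gene, tissue)][direction].add(study_id)  # sets ignore duplicates
    return tally


def consistent_findings(records, min_studies=3):
    """Return {(gene, tissue): (direction, n_agreeing, n_opposing)} for
    gene/tissue pairs where at least min_studies separate studies agree."""
    result = {}
    for key, by_direction in tally_directions(records).items():
        n_up, n_down = len(by_direction["up"]), len(by_direction["down"])
        if n_up >= n_down:
            direction, agreeing, opposing = "up", n_up, n_down
        else:
            direction, agreeing, opposing = "down", n_down, n_up
        if agreeing >= min_studies:
            result[key] = (direction, agreeing, opposing)
    return result


# Placeholder records for illustration only (not data from the included studies).
EXAMPLE_RECORDS = [
    ("S1", "miR-A", "blood", "up"),
    ("S2", "miR-A", "blood", "up"),
    ("S3", "miR-A", "blood", "up"),
    ("S4", "miR-A", "brain", "down"),
    ("S1", "miR-B", "blood", "up"),
    ("S2", "miR-B", "blood", "down"),
]

if __name__ == "__main__":
    for key, summary in consistent_findings(EXAMPLE_RECORDS).items():
        print(key, summary)
```

Keeping the opposing count alongside the agreeing count, and keying on tissue as well as gene, avoids treating an opposite finding in a different tissue as a contradiction while still exposing genuine conflicts within one tissue.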
Expression assays for ncRNA. Various technologies for measuring ncRNA expression levels have been used in the studies, each with different strengths and limitations [207]. Quantitative polymerase chain reaction (qPCR) assays are based on the amplification of target ncRNA genes of known sequence. Although qPCR assays are known for their high sensitivity and specificity, the sensitivity does depend on the target abundance and the efficiency of the amplification [208]. If there are sequences closely related to the target sequence, there is a risk of false amplification. The many different protocols, reagents, and analysis methods, together with a lack of technical information, led to recommendations for qPCR assay design and data reporting, the "minimum information for the publication of qPCR experiments" (MIQE) [209]. qPCR assays can be expensive, as each target requires specific primers and probes. They are commonly used to validate gene expression changes identified by other methods, such as microarrays or Next-Generation Sequencing (NGS). Microarrays are cost-effective and have been widely used in ncRNA gene expression research. However, they may not be sensitive enough to detect expression of low-abundance ncRNA genes and can suffer from dynamic range issues which affect the quantification of highly abundant transcripts [210]. Microarray results can also be influenced by probe design bias, as the performance of the probes may vary depending on their sequence. Differences in hybridisation as well as normalisation issues mean that RNA sequencing is sometimes preferred [211]. NGS has revolutionised ncRNA research by allowing comprehensive profiling of ncRNAs. However, biases in library preparation methods, including at ligation, reverse transcription, and amplification steps, and sequencing errors, can all affect the accuracy of ncRNA identification and quantification [212]. Furthermore, NGS generates huge amounts of data, requiring advanced bioinformatics tools and computational resources for data analysis.
Implications for clinical practice. At the current time there are no implications for clinical practice that we could reliably draw from these results, with limited evidence to support ncRNA gene expression as biomarkers for ASD. The ncRNA genes with differential expression identified in this systematic review have all been implicated in several other diseases and biological processes and there is limited or no reporting of any high sensitivity and/or specificity scores or validation studies. There are also limited descriptions of phenotypes in the ASD groups. There is, however, enough promise to suggest that continued research in this field has the potential to improve our understanding of mechanisms associated with neurodevelopmental differences such as ASD.
Implications for research. By contrast, there are many implications for research to consider. The finding that there is limited research examining gene expression in classes of ncRNA other than miRNA is important to report. This shines a light on the omission in the research literature. Given that miRNA gene silencing occurs in many tissue types including in the developing brain [213], it is intriguing that four proteins critical for miRNA biogenesis [214] are encoded by genes associated with Mendelian disorders where ASD and overlapping neurobehavioral phenotypes are highly prevalent: DGCR8 (included within the deleted region in chromosome 22q11.2) [215], MECP2 (Rett syndrome) [216], FOXG1 [217] and FMR1 (Fragile X) [218,219]. As key regulators of gene expression, miRNA may have a role in modifying genetic variants demonstrating incomplete penetrance and variable expressivity [220]. This theory is interesting, considering the multiple examples of recurrent pathogenic CNVs associated with variable ASD risk [19].
Some standardisation is required to overcome the large variability in quality and reporting of ncRNA gene expression in ASD. Improved methodologies and reporting would greatly benefit the research endeavour. Alongside MIQE mentioned above, we recommend researchers work to the FAIR Guiding Principles for scientific data management and stewardship (2016) [221] to improve the findability, accessibility, interoperability, and reusability of ncRNA expression data in ASD and other ncRNA expression studies. This would provide the standardisation and authentication necessary for data to be reusable. Feature level extraction output (FLEO) files have been recommended because published gene lists (PGL data) and gene expression data matrices (GEDMs) have been deemed unsuitable for meta-analysis due to their dependence on the pre-processing used [118]. Sharing research data between research groups comes with challenges [222], and public sharing of raw data in biomedical microarray studies appears to be more likely for studies published in high impact journals and when lead authors are more experienced researchers [223,224]. The majority of journals and funders now have data sharing policies. National and international data protection laws restrict data sharing by genomic researchers, but a number of initiatives have been developed to promote successful data sharing, including those hosted by the European Molecular Biology Laboratory's European Bioinformatics Institute [225], the International Cancer Genome Consortium's project [226], the Pan-Cancer Analysis of Whole Genomes (PCAWG) [227] and the Human Cell Atlas [228]. The researchers involved in setting up PCAWG have called for an international code of conduct to overcome issues with data protection and provide guidelines for researchers [229].

Conclusion
The search for discrete genetic, immunological, metabolic, neurological/neurophysiological and behavioural associations with ASD continues [32]. Differential expression of ncRNA genes has shown much promise in various conditions and may play a role in the multifactorial aetiology of ASD. At present, no clear conclusions can be drawn from this systematic review for implementation into clinical practice. The key recommendations from our study are to improve research methodologies, reporting and data sharing in this field and to fund and deliver larger studies with more power that will increase the likelihood of being able to answer important questions.

Supporting information
S1 Fig. Gene Ontology analysis heatmap using four most notable differentially expressed miRNA genes in ASD identified by this systematic review. DIANA-miRPath v3.0 online interface DIANA-microT-CDS was used to perform analysis of Gene Ontology Categories (x axis) versus the four key miRNA genes identified in this systematic review (miR-106b-5p, miR-328-3p, miR-146a-5p and miR-155-5p) (y axis). P-value and microT threshold were set at < 0.05 and 0.8, respectively and False Discovery Rate (FDR) applied. The heatmap shows the levels of enrichment as determined by Log(p values). (TIF) S1